Overview

Dataset statistics

Number of variables: 20
Number of observations: 34857
Missing cells: 93365
Missing cells (%): 13.4%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 5.3 MiB
Average record size in memory: 160.0 B
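
The figures above are tied together by simple arithmetic. The following is a minimal sketch, assuming the missing-cell percentage is taken over all 20 × 34857 cells and that 1 MiB is 1,048,576 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 20
n_observations = 34857
missing_cells = 93365

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Reverse check of the memory figures: 160 B per record over all records.
approx_total_mib = 160.0 * n_observations / (1024 * 1024)

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"approximate size: {approx_total_mib:.1f} MiB")
```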

Variable types

Categorical: 8
Numeric: 12

Alerts

Dataset has 1 (< 0.1%) duplicate rows Duplicates
Suburb has a high cardinality: 351 distinct values High cardinality
Address has a high cardinality: 34009 distinct values High cardinality
SellerG has a high cardinality: 388 distinct values High cardinality
Date has a high cardinality: 78 distinct values High cardinality
Rooms is highly correlated with Bedroom2 and 2 other fields High correlation
Postcode is highly correlated with Lattitude and 1 other fields High correlation
Bedroom2 is highly correlated with Rooms and 2 other fields High correlation
Bathroom is highly correlated with Rooms and 2 other fields High correlation
BuildingArea is highly correlated with Rooms and 2 other fields High correlation
Lattitude is highly correlated with Postcode High correlation
Longtitude is highly correlated with Postcode High correlation
Rooms is highly correlated with Bedroom2 and 1 other fields High correlation
Bedroom2 is highly correlated with Rooms and 1 other fields High correlation
Bathroom is highly correlated with Rooms and 1 other fields High correlation
Rooms is highly correlated with Bedroom2 and 2 other fields High correlation
Bedroom2 is highly correlated with Rooms and 2 other fields High correlation
Bathroom is highly correlated with Rooms and 2 other fields High correlation
BuildingArea is highly correlated with Rooms and 2 other fields High correlation
Regionname is highly correlated with CouncilArea High correlation
CouncilArea is highly correlated with Regionname High correlation
Rooms is highly correlated with Type and 2 other fields High correlation
Type is highly correlated with Rooms and 1 other fields High correlation
Distance is highly correlated with Postcode and 5 other fields High correlation
Postcode is highly correlated with Distance and 4 other fields High correlation
Bedroom2 is highly correlated with Rooms and 2 other fields High correlation
Bathroom is highly correlated with Rooms and 1 other fields High correlation
Landsize is highly correlated with BuildingArea High correlation
BuildingArea is highly correlated with Landsize High correlation
CouncilArea is highly correlated with Distance and 5 other fields High correlation
Lattitude is highly correlated with Distance and 5 other fields High correlation
Longtitude is highly correlated with Distance and 5 other fields High correlation
Regionname is highly correlated with Distance and 4 other fields High correlation
Propertycount is highly correlated with Distance and 3 other fields High correlation
Bedroom2 has 8217 (23.6%) missing values Missing
Bathroom has 8226 (23.6%) missing values Missing
Car has 8728 (25.0%) missing values Missing
Landsize has 11810 (33.9%) missing values Missing
BuildingArea has 21115 (60.6%) missing values Missing
YearBuilt has 19306 (55.4%) missing values Missing
Lattitude has 7976 (22.9%) missing values Missing
Longtitude has 7976 (22.9%) missing values Missing
Landsize is highly skewed (γ1 = 96.02231136) Skewed
BuildingArea is highly skewed (γ1 = 99.13257937) Skewed
Address is uniformly distributed Uniform
Car has 1631 (4.7%) zeros Zeros
Landsize has 2437 (7.0%) zeros Zeros
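
The missing-value percentages in the alerts follow directly from the counts and the 34857 observations. The sketch below recomputes them and checks that, together with the small missing counts listed in the per-variable sections (Distance, Postcode, CouncilArea, Regionname, Propertycount), they add up to the 93365 missing cells in the overview. The dictionary names are illustrative only.

```python
n_observations = 34857

# Missing counts from the alerts above.
alerted_missing = {
    "Bedroom2": 8217,
    "Bathroom": 8226,
    "Car": 8728,
    "Landsize": 11810,
    "BuildingArea": 21115,
    "YearBuilt": 19306,
    "Lattitude": 7976,
    "Longtitude": 7976,
}

# Small missing counts that appear only in the per-variable sections.
minor_missing = {
    "Distance": 1,
    "Postcode": 1,
    "CouncilArea": 3,
    "Regionname": 3,
    "Propertycount": 3,
}

missing_pct = {
    name: round(100 * count / n_observations, 1)
    for name, count in alerted_missing.items()
}
ranked = sorted(missing_pct.items(), key=lambda item: item[1], reverse=True)
total_missing = sum(alerted_missing.values()) + sum(minor_missing.values())

for name, pct in ranked:
    print(f"{name}: {pct}%")
print(f"total missing cells: {total_missing}")
```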

Reproduction

Analysis started: 2021-10-09 14:05:54.930704
Analysis finished: 2021-10-09 14:06:09.408423
Duration: 14.48 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
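
The reported duration can be cross-checked from the two timestamps. This is a small sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2021-10-09 14:05:54.930704")
finished = datetime.fromisoformat("2021-10-09 14:06:09.408423")

duration_seconds = (finished - started).total_seconds()
print(f"duration: {duration_seconds:.2f} seconds")
```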

Variables

Suburb
Categorical

HIGH CARDINALITY

Distinct: 351
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 272.4 KiB

Length

Max length: 18
Median length: 9
Mean length: 9.819175488
Min length: 3

Unique

Unique: 20
Unique (%): 0.1%

Sample

1st row: Abbotsford
2nd row: Abbotsford
3rd row: Abbotsford
4th row: Abbotsford
5th row: Abbotsford

Common Values

Value | Count | Frequency (%)
Reservoir | 844 | 2.4%
Bentleigh East | 583 | 1.7%
Richmond | 552 | 1.6%
Glen Iris | 491 | 1.4%
Preston | 485 | 1.4%
Kew | 467 | 1.3%
Brighton | 456 | 1.3%
Brunswick | 444 | 1.3%
South Yarra | 435 | 1.2%
Hawthorn | 428 | 1.2%
Other values (341) | 29672 | 85.1%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
east | 2735 | 5.7%
north | 1795 | 3.7%
south | 1398 | 2.9%
west | 1084 | 2.3%
melbourne | 1053 | 2.2%
bentleigh | 902 | 1.9%
park | 885 | 1.8%
brunswick | 877 | 1.8%
brighton | 849 | 1.8%
reservoir | 844 | 1.8%
Other values (296) | 35685 | 74.2%

Address
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 34009
Distinct (%): 97.6%
Missing: 0
Missing (%): 0.0%
Memory size: 272.4 KiB

Length

Max length: 27
Median length: 13
Mean length: 13.55136701
Min length: 8

Unique

Unique: 33201
Unique (%): 95.2%

Sample

1st row: 68 Studley St
2nd row: 85 Turner St
3rd row: 25 Bloomburg St
4th row: 18/659 Victoria St
5th row: 5 Charles St

Common Values

Value | Count | Frequency (%)
5 Charles St | 6 | < 0.1%
25 William St | 4 | < 0.1%
2 Bruce St | 3 | < 0.1%
13 George St | 3 | < 0.1%
3 Charles St | 3 | < 0.1%
57 Bay Rd | 3 | < 0.1%
28 Blair St | 3 | < 0.1%
3 Donald St | 3 | < 0.1%
38 Stewart St | 3 | < 0.1%
21 May St | 3 | < 0.1%
Other values (33999) | 34823 | 99.9%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
st | 17238 | 16.4%
rd | 6592 | 6.3%
av | 3395 | 3.2%
ct | 1743 | 1.7%
dr | 1266 | 1.2%
cr | 1171 | 1.1%
gr | 733 | 0.7%
3 | 695 | 0.7%
5 | 671 | 0.6%
4 | 656 | 0.6%
Other values (12873) | 70896 | 67.5%

Rooms
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.031012422
Minimum: 1
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 2
Median: 3
Q3: 4
95-th percentile: 5
Maximum: 16
Range: 15
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 0.9699329349
Coefficient of variation (CV): 0.320002956
Kurtosis: 2.511708654
Mean: 3.031012422
Median Absolute Deviation (MAD): 1
Skewness: 0.4990968808
Sum: 105652
Variance: 0.9407698982
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)

Common Values

Value | Count | Frequency (%)
3 | 15084 | 43.3%
2 | 8332 | 23.9%
4 | 7956 | 22.8%
5 | 1737 | 5.0%
1 | 1479 | 4.2%
6 | 204 | 0.6%
7 | 32 | 0.1%
8 | 19 | 0.1%
10 | 6 | < 0.1%
9 | 4 | < 0.1%
Other values (2) | 4 | < 0.1%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1479 | 4.2%
2 | 8332 | 23.9%
3 | 15084 | 43.3%
4 | 7956 | 22.8%
5 | 1737 | 5.0%
6 | 204 | 0.6%
7 | 32 | 0.1%
8 | 19 | 0.1%
9 | 4 | < 0.1%
10 | 6 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
16 | 1 | < 0.1%
12 | 3 | < 0.1%
10 | 6 | < 0.1%
9 | 4 | < 0.1%
8 | 19 | 0.1%
7 | 32 | 0.1%
6 | 204 | 0.6%
5 | 1737 | 5.0%
4 | 7956 | 22.8%
3 | 15084 | 43.3%
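
Because Rooms has only 12 distinct values, its descriptive statistics can be rebuilt from the frequency tables alone. The sketch below does this with the standard library. It assumes the variance uses the sample (n - 1) denominator, which is an assumption about how the report computed it; the names are illustrative.

```python
import math

# Value -> count, combined from the Rooms frequency tables above.
rooms_counts = {
    1: 1479, 2: 8332, 3: 15084, 4: 7956, 5: 1737, 6: 204,
    7: 32, 8: 19, 9: 4, 10: 6, 12: 3, 16: 1,
}

n = sum(rooms_counts.values())
total = sum(value * count for value, count in rooms_counts.items())
mean = total / n

# Assumption: sample variance with an (n - 1) denominator.
squared_dev = sum(count * (value - mean) ** 2
                  for value, count in rooms_counts.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(f"n={n} sum={total} mean={mean:.9f}")
print(f"variance={variance:.10f} std={std:.10f} cv={cv:.9f}")
```

Under that assumption the results agree with the Sum, Mean, Variance, and Coefficient of variation reported for Rooms.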

Type
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 272.4 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: h
2nd row: h
3rd row: h
4th row: u
5th row: h

Common Values

Value | Count | Frequency (%)
h | 23980 | 68.8%
u | 7297 | 20.9%
t | 3580 | 10.3%

Length

Histogram of lengths of the category

Pie chart

Method
Categorical

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 272.4 KiB

Length

Max length: 2
Median length: 1
Mean length: 1.428608314
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SS
2nd row: S
3rd row: S
4th row: VB
5th row: SP

Common Values

Value | Count | Frequency (%)
S | 19744 | 56.6%
SP | 5095 | 14.6%
PI | 4850 | 13.9%
VB | 3108 | 8.9%
SN | 1317 | 3.8%
PN | 308 | 0.9%
SA | 226 | 0.6%
W | 173 | 0.5%
SS | 36 | 0.1%

Length

Histogram of lengths of the category

Pie chart

Words

Value | Count | Frequency (%)
s | 19744 | 56.6%
sp | 5095 | 14.6%
pi | 4850 | 13.9%
vb | 3108 | 8.9%
sn | 1317 | 3.8%
pn | 308 | 0.9%
sa | 226 | 0.6%
w | 173 | 0.5%
ss | 36 | 0.1%

SellerG
Categorical

HIGH CARDINALITY

Distinct: 388
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 272.4 KiB

Length

Max length: 27
Median length: 6
Mean length: 6.291533982
Min length: 1

Unique

Unique: 107
Unique (%): 0.3%

Sample

1st row: Jellis
2nd row: Biggin
3rd row: Biggin
4th row: Rounds
5th row: Biggin

Common Values

Value | Count | Frequency (%)
Jellis | 3359 | 9.6%
Nelson | 3236 | 9.3%
Barry | 3235 | 9.3%
hockingstuart | 2623 | 7.5%
Marshall | 2027 | 5.8%
Ray | 1950 | 5.6%
Buxton | 1868 | 5.4%
Biggin | 897 | 2.6%
Fletchers | 861 | 2.5%
Woodards | 714 | 2.0%
Other values (378) | 14087 | 40.4%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
jellis | 3359 | 9.6%
nelson | 3236 | 9.3%
barry | 3235 | 9.3%
hockingstuart | 2623 | 7.5%
marshall | 2027 | 5.8%
ray | 1950 | 5.6%
buxton | 1868 | 5.4%
biggin | 897 | 2.6%
fletchers | 861 | 2.5%
woodards | 714 | 2.0%
Other values (373) | 14087 | 40.4%

Date
Categorical

HIGH CARDINALITY

Distinct: 78
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 272.4 KiB

Length

Max length: 10
Median length: 10
Mean length: 9.714748831
Min length: 9

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3/09/2016
2nd row: 3/12/2016
3rd row: 4/02/2016
4th row: 4/02/2016
5th row: 4/03/2017

Common Values

Value | Count | Frequency (%)
28/10/2017 | 1119 | 3.2%
17/03/2018 | 970 | 2.8%
24/02/2018 | 941 | 2.7%
9/12/2017 | 927 | 2.7%
25/11/2017 | 902 | 2.6%
18/11/2017 | 866 | 2.5%
3/03/2018 | 846 | 2.4%
6/01/2018 | 787 | 2.3%
27/05/2017 | 770 | 2.2%
23/09/2017 | 742 | 2.1%
Other values (68) | 25987 | 74.6%

Length

Histogram of lengths of the category
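
Date is profiled as a categorical variable because the values are day/month/year strings (entries such as 28/10/2017 rule out a month-first reading). A minimal sketch of parsing them into real dates with the standard library, using the five sample rows shown above; the variable names are illustrative.

```python
from datetime import datetime

# Sample rows copied from the Date section.
samples = ["3/09/2016", "3/12/2016", "4/02/2016", "4/02/2016", "4/03/2017"]

# Day-first format; %d also accepts a day written without a leading zero.
parsed = [datetime.strptime(text, "%d/%m/%Y").date() for text in samples]

for text, day in zip(samples, parsed):
    print(f"{text} -> {day.isoformat()}")

print(f"earliest: {min(parsed).isoformat()}, latest: {max(parsed).isoformat()}")
```

Once parsed, the column can be sorted or grouped chronologically instead of being treated as 78 unrelated labels.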

Distance
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 215
Distinct (%): 0.6%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.18492942
Minimum: 0
Maximum: 48.1
Zeros: 77
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2.7
Q1: 6.4
Median: 10.3
Q3: 14
95-th percentile: 24.7
Maximum: 48.1
Range: 48.1
Interquartile range (IQR): 7.6

Descriptive statistics

Standard deviation: 6.788892456
Coefficient of variation (CV): 0.6069678403
Kurtosis: 3.585924276
Mean: 11.18492942
Median Absolute Deviation (MAD): 3.85
Skewness: 1.503585816
Sum: 389861.9
Variance: 46.08906078
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
11.2 | 1420 | 4.1%
13.8 | 681 | 2.0%
9.2 | 665 | 1.9%
7.8 | 662 | 1.9%
10.5 | 660 | 1.9%
8.4 | 604 | 1.7%
4.6 | 585 | 1.7%
14.7 | 566 | 1.6%
5.2 | 565 | 1.6%
11.4 | 521 | 1.5%
Other values (205) | 27927 | 80.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 77 | 0.2%
0.7 | 29 | 0.1%
1.2 | 47 | 0.1%
1.3 | 30 | 0.1%
1.4 | 6 | < 0.1%
1.5 | 29 | 0.1%
1.6 | 194 | 0.6%
1.8 | 152 | 0.4%
1.9 | 148 | 0.4%
2 | 52 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
48.1 | 6 | < 0.1%
47.4 | 7 | < 0.1%
47.3 | 20 | 0.1%
45.9 | 33 | 0.1%
45.2 | 2 | < 0.1%
44.2 | 20 | 0.1%
43.4 | 5 | < 0.1%
43.3 | 6 | < 0.1%
41 | 18 | 0.1%
39.8 | 2 | < 0.1%

Postcode
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 211
Distinct (%): 0.6%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 3116.062859
Minimum: 3000
Maximum: 3978
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 3000
5-th percentile: 3015
Q1: 3051
Median: 3103
Q3: 3156
95-th percentile: 3204
Maximum: 3978
Range: 978
Interquartile range (IQR): 105

Descriptive statistics

Standard deviation: 109.0239027
Coefficient of variation (CV): 0.03498770971
Kurtosis: 22.78373808
Mean: 3116.062859
Median Absolute Deviation (MAD): 52
Skewness: 4.018785705
Sum: 108613487
Variance: 11886.21137
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
3073 | 844 | 2.4%
3046 | 638 | 1.8%
3020 | 617 | 1.8%
3121 | 612 | 1.8%
3165 | 583 | 1.7%
3058 | 556 | 1.6%
3040 | 535 | 1.5%
3204 | 518 | 1.5%
3163 | 508 | 1.5%
3012 | 497 | 1.4%
Other values (201) | 28948 | 83.0%

Minimum 10 values

Value | Count | Frequency (%)
3000 | 204 | 0.6%
3002 | 59 | 0.2%
3003 | 66 | 0.2%
3006 | 76 | 0.2%
3008 | 16 | < 0.1%
3011 | 375 | 1.1%
3012 | 497 | 1.4%
3013 | 304 | 0.9%
3015 | 345 | 1.0%
3016 | 240 | 0.7%

Maximum 10 values

Value | Count | Frequency (%)
3978 | 5 | < 0.1%
3977 | 33 | 0.1%
3976 | 7 | < 0.1%
3975 | 2 | < 0.1%
3910 | 18 | 0.1%
3810 | 20 | 0.1%
3809 | 6 | < 0.1%
3808 | 2 | < 0.1%
3807 | 7 | < 0.1%
3806 | 47 | 0.1%

Bedroom2
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 15
Distinct (%): 0.1%
Missing: 8217
Missing (%): 23.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.084647147
Minimum: 0
Maximum: 30
Zeros: 17
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 2
Median: 3
Q3: 4
95-th percentile: 5
Maximum: 30
Range: 30
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 0.9806897285
Coefficient of variation (CV): 0.3179260647
Kurtosis: 26.80745531
Mean: 3.084647147
Median Absolute Deviation (MAD): 1
Skewness: 1.406365679
Sum: 82175
Variance: 0.9617523437
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)

Common Values

Value | Count | Frequency (%)
3 | 11881 | 34.1%
4 | 6348 | 18.2%
2 | 5777 | 16.6%
5 | 1427 | 4.1%
1 | 966 | 2.8%
6 | 168 | 0.5%
7 | 30 | 0.1%
0 | 17 | < 0.1%
8 | 13 | < 0.1%
9 | 5 | < 0.1%
Other values (5) | 8 | < 0.1%
(Missing) | 8217 | 23.6%

Minimum 10 values

Value | Count | Frequency (%)
0 | 17 | < 0.1%
1 | 966 | 2.8%
2 | 5777 | 16.6%
3 | 11881 | 34.1%
4 | 6348 | 18.2%
5 | 1427 | 4.1%
6 | 168 | 0.5%
7 | 30 | 0.1%
8 | 13 | < 0.1%
9 | 5 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
30 | 1 | < 0.1%
20 | 1 | < 0.1%
16 | 1 | < 0.1%
12 | 1 | < 0.1%
10 | 4 | < 0.1%
9 | 5 | < 0.1%
8 | 13 | < 0.1%
7 | 30 | 0.1%
6 | 168 | 0.5%
5 | 1427 | 4.1%

Bathroom
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 11
Distinct (%): < 0.1%
Missing: 8226
Missing (%): 23.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.624798168
Minimum: 0
Maximum: 12
Zeros: 46
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
Median: 2
Q3: 2
95-th percentile: 3
Maximum: 12
Range: 12
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7242120115
Coefficient of variation (CV): 0.4457242911
Kurtosis: 4.861008943
Mean: 1.624798168
Median Absolute Deviation (MAD): 1
Skewness: 1.356293032
Sum: 43270
Variance: 0.5244830376
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)

Common Values

Value | Count | Frequency (%)
1 | 12969 | 37.2%
2 | 11064 | 31.7%
3 | 2181 | 6.3%
4 | 269 | 0.8%
5 | 77 | 0.2%
0 | 46 | 0.1%
6 | 16 | < 0.1%
7 | 4 | < 0.1%
8 | 3 | < 0.1%
12 | 1 | < 0.1%
(Missing) | 8226 | 23.6%

Minimum 10 values

Value | Count | Frequency (%)
0 | 46 | 0.1%
1 | 12969 | 37.2%
2 | 11064 | 31.7%
3 | 2181 | 6.3%
4 | 269 | 0.8%
5 | 77 | 0.2%
6 | 16 | < 0.1%
7 | 4 | < 0.1%
8 | 3 | < 0.1%
9 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
12 | 1 | < 0.1%
9 | 1 | < 0.1%
8 | 3 | < 0.1%
7 | 4 | < 0.1%
6 | 16 | < 0.1%
5 | 77 | 0.2%
4 | 269 | 0.8%
3 | 2181 | 6.3%
2 | 11064 | 31.7%
1 | 12969 | 37.2%

Car
Real number (ℝ≥0)

MISSING
ZEROS

Distinct: 15
Distinct (%): 0.1%
Missing: 8728
Missing (%): 25.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.728845344
Minimum: 0
Maximum: 26
Zeros: 1631
Zeros (%): 4.7%
Negative: 0
Negative (%): 0.0%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 2
Q3: 2
95-th percentile: 4
Maximum: 26
Range: 26
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.010770785
Coefficient of variation (CV): 0.5846507837
Kurtosis: 20.85932625
Mean: 1.728845344
Median Absolute Deviation (MAD): 1
Skewness: 2.09517618
Sum: 45173
Variance: 1.021657581
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)

Common Values

Value | Count | Frequency (%)
2 | 12214 | 35.0%
1 | 9164 | 26.3%
0 | 1631 | 4.7%
3 | 1606 | 4.6%
4 | 1161 | 3.3%
5 | 151 | 0.4%
6 | 140 | 0.4%
7 | 25 | 0.1%
8 | 23 | 0.1%
10 | 6 | < 0.1%
Other values (5) | 8 | < 0.1%
(Missing) | 8728 | 25.0%

Minimum 10 values

Value | Count | Frequency (%)
0 | 1631 | 4.7%
1 | 9164 | 26.3%
2 | 12214 | 35.0%
3 | 1606 | 4.6%
4 | 1161 | 3.3%
5 | 151 | 0.4%
6 | 140 | 0.4%
7 | 25 | 0.1%
8 | 23 | 0.1%
9 | 3 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
26 | 1 | < 0.1%
18 | 1 | < 0.1%
12 | 1 | < 0.1%
11 | 2 | < 0.1%
10 | 6 | < 0.1%
9 | 3 | < 0.1%
8 | 23 | 0.1%
7 | 25 | 0.1%
6 | 140 | 0.4%
5 | 151 | 0.4%

Landsize
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
SKEWED
ZEROS

Distinct: 1684
Distinct (%): 7.3%
Missing: 11810
Missing (%): 33.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 593.5989934
Minimum: 0
Maximum: 433014
Zeros: 2437
Zeros (%): 7.0%
Negative: 0
Negative (%): 0.0%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 224
Median: 521
Q3: 670
95-th percentile: 1001
Maximum: 433014
Range: 433014
Interquartile range (IQR): 446

Descriptive statistics

Standard deviation: 3398.841946
Coefficient of variation (CV): 5.725821614
Kurtosis: 11580.16251
Mean: 593.5989934
Median Absolute Deviation (MAD): 210
Skewness: 96.02231136
Sum: 13680676
Variance: 11552126.58
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
0 | 2437 | 7.0%
650 | 204 | 0.6%
697 | 123 | 0.4%
585 | 97 | 0.3%
700 | 86 | 0.2%
604 | 84 | 0.2%
534 | 81 | 0.2%
696 | 80 | 0.2%
652 | 68 | 0.2%
600 | 68 | 0.2%
Other values (1674) | 19719 | 56.6%
(Missing) | 11810 | 33.9%

Minimum 10 values

Value | Count | Frequency (%)
0 | 2437 | 7.0%
1 | 3 | < 0.1%
2 | 1 | < 0.1%
3 | 2 | < 0.1%
5 | 1 | < 0.1%
10 | 1 | < 0.1%
14 | 1 | < 0.1%
15 | 2 | < 0.1%
17 | 1 | < 0.1%
18 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
433014 | 1 | < 0.1%
146699 | 1 | < 0.1%
89030 | 1 | < 0.1%
80000 | 1 | < 0.1%
76000 | 1 | < 0.1%
75100 | 1 | < 0.1%
44500 | 1 | < 0.1%
42800 | 1 | < 0.1%
41400 | 1 | < 0.1%
40500 | 1 | < 0.1%
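
The extreme skewness of Landsize comes from a handful of very large values. One common way to see this, not something the report itself computes, is Tukey's 1.5 × IQR rule applied to the quartiles above. A minimal sketch, with the multiplier 1.5 as a conventional assumption and illustrative names:

```python
# Quartiles copied from the Landsize quantile statistics.
q1, q3 = 224, 670
iqr = q3 - q1

# Assumption: the conventional Tukey multiplier of 1.5.
k = 1.5
lower_fence = q1 - k * iqr
upper_fence = q3 + k * iqr

# The ten largest Landsize values listed in the report.
largest = [433014, 146699, 89030, 80000, 76000,
           75100, 44500, 42800, 41400, 40500]
flagged = [value for value in largest if value > upper_fence]

print(f"IQR={iqr}, fences=({lower_fence}, {upper_fence})")
print(f"{len(flagged)} of {len(largest)} listed maxima exceed the upper fence")
```

All ten listed maxima lie far above the upper fence, which is consistent with the Skewed alert for this variable.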

BuildingArea
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
SKEWED

Distinct: 740
Distinct (%): 5.4%
Missing: 21115
Missing (%): 60.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 160.2564004
Minimum: 0
Maximum: 44515
Zeros: 76
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 56
Q1: 102
Median: 136
Q3: 188
95-th percentile: 310
Maximum: 44515
Range: 44515
Interquartile range (IQR): 86

Descriptive statistics

Standard deviation: 401.2670601
Coefficient of variation (CV): 2.50390661
Kurtosis: 10877.52575
Mean: 160.2564004
Median Absolute Deviation (MAD): 41
Skewness: 99.13257937
Sum: 2202243.454
Variance: 161015.2535
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
120 | 185 | 0.5%
100 | 161 | 0.5%
110 | 159 | 0.5%
130 | 153 | 0.4%
115 | 149 | 0.4%
140 | 142 | 0.4%
150 | 136 | 0.4%
160 | 123 | 0.4%
112 | 123 | 0.4%
125 | 119 | 0.3%
Other values (730) | 12292 | 35.3%
(Missing) | 21115 | 60.6%

Minimum 10 values

Value | Count | Frequency (%)
0 | 76 | 0.2%
0.01 | 1 | < 0.1%
1 | 15 | < 0.1%
2 | 20 | 0.1%
3 | 25 | 0.1%
4 | 6 | < 0.1%
5 | 4 | < 0.1%
7 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
44515 | 1 | < 0.1%
6791 | 1 | < 0.1%
6178 | 1 | < 0.1%
4645 | 1 | < 0.1%
3647 | 1 | < 0.1%
3558 | 1 | < 0.1%
3112 | 1 | < 0.1%
2002 | 1 | < 0.1%
1561 | 1 | < 0.1%
1143 | 1 | < 0.1%

YearBuilt
Real number (ℝ≥0)

MISSING

Distinct: 160
Distinct (%): 1.0%
Missing: 19306
Missing (%): 55.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 1965.289885
Minimum: 1196
Maximum: 2106
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 1196
5-th percentile: 1900
Q1: 1940
Median: 1970
Q3: 2000
95-th percentile: 2013
Maximum: 2106
Range: 910
Interquartile range (IQR): 60

Descriptive statistics

Standard deviation: 37.32817802
Coefficient of variation (CV): 0.01899372622
Kurtosis: 10.89861685
Mean: 1965.289885
Median Absolute Deviation (MAD): 30
Skewness: -1.080913147
Sum: 30562223
Variance: 1393.392875
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
1970 | 1490 | 4.3%
1960 | 1260 | 3.6%
1950 | 1089 | 3.1%
1980 | 726 | 2.1%
1900 | 606 | 1.7%
2000 | 571 | 1.6%
1920 | 545 | 1.6%
1930 | 531 | 1.5%
1910 | 460 | 1.3%
1890 | 444 | 1.3%
Other values (150) | 7829 | 22.5%
(Missing) | 19306 | 55.4%

Minimum 10 values

Value | Count | Frequency (%)
1196 | 1 | < 0.1%
1800 | 1 | < 0.1%
1820 | 1 | < 0.1%
1830 | 1 | < 0.1%
1850 | 4 | < 0.1%
1854 | 2 | < 0.1%
1855 | 1 | < 0.1%
1856 | 1 | < 0.1%
1857 | 2 | < 0.1%
1860 | 11 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
2106 | 1 | < 0.1%
2019 | 1 | < 0.1%
2018 | 4 | < 0.1%
2017 | 82 | 0.2%
2016 | 130 | 0.4%
2015 | 156 | 0.4%
2014 | 212 | 0.6%
2013 | 247 | 0.7%
2012 | 333 | 1.0%
2011 | 241 | 0.7%
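
The YearBuilt minimum (1196) and maximum (2106) sit far from the rest of the distribution and look like data-entry errors, although the report does not say so. A minimal plausibility check is sketched below. The bounds are hypothetical choices made for illustration (1800 as a loose lower bound, 2021 as the year the report was generated), not values taken from the report.

```python
# Hypothetical plausibility window; adjust to the domain knowledge available.
EARLIEST_PLAUSIBLE_YEAR = 1800
LATEST_PLAUSIBLE_YEAR = 2021

# Extreme YearBuilt values listed in the tables above.
extreme_years = [1196, 1800, 1820, 1830, 1850, 2011, 2017, 2018, 2019, 2106]

suspicious = [
    year for year in extreme_years
    if not EARLIEST_PLAUSIBLE_YEAR <= year <= LATEST_PLAUSIBLE_YEAR
]

print(f"suspicious years: {suspicious}")
```

With these bounds only 1196 and 2106 are flagged; each occurs once in the data.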

CouncilArea
Categorical

HIGH CORRELATION

Distinct: 33
Distinct (%): 0.1%
Missing: 3
Missing (%): < 0.1%
Memory size: 272.4 KiB

Length

Max length: 30
Median length: 22
Mean length: 21.73417685
Min length: 17

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Yarra City Council
2nd row: Yarra City Council
3rd row: Yarra City Council
4th row: Yarra City Council
5th row: Yarra City Council

Common Values

Value | Count | Frequency (%)
Boroondara City Council | 3675 | 10.5%
Darebin City Council | 2851 | 8.2%
Moreland City Council | 2122 | 6.1%
Glen Eira City Council | 2006 | 5.8%
Melbourne City Council | 1952 | 5.6%
Banyule City Council | 1861 | 5.3%
Moonee Valley City Council | 1791 | 5.1%
Bayside City Council | 1764 | 5.1%
Brimbank City Council | 1593 | 4.6%
Monash City Council | 1466 | 4.2%
Other values (23) | 13773 | 39.5%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
council | 34854 | 31.4%
city | 34550 | 31.1%
boroondara | 3675 | 3.3%
darebin | 2851 | 2.6%
moreland | 2122 | 1.9%
glen | 2006 | 1.8%
eira | 2006 | 1.8%
melbourne | 1952 | 1.8%
banyule | 1861 | 1.7%
valley | 1791 | 1.6%
Other values (31) | 23375 | 21.1%

Lattitude
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct: 13402
Distinct (%): 49.9%
Missing: 7976
Missing (%): 22.9%
Infinite: 0
Infinite (%): 0.0%
Mean: -37.8106343
Minimum: -38.19043
Maximum: -37.3902
Zeros: 0
Zeros (%): 0.0%
Negative: 26881
Negative (%): 77.1%
Memory size: 272.4 KiB

Quantile statistics

Minimum: -38.19043
5-th percentile: -37.9485
Q1: -37.86295
Median: -37.8076
Q3: -37.7541
95-th percentile: -37.67519
Maximum: -37.3902
Range: 0.80023
Interquartile range (IQR): 0.10885

Descriptive statistics

Standard deviation: 0.09027890451
Coefficient of variation (CV): -0.002387659086
Kurtosis: 1.544527049
Mean: -37.8106343
Median Absolute Deviation (MAD): 0.05448
Skewness: -0.2576614223
Sum: -1016387.66
Variance: 0.008150280599
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
-37.8361 | 25 | 0.1%
-37.8424 | 22 | 0.1%
-37.8198 | 20 | 0.1%
-37.7956 | 20 | 0.1%
-37.8414 | 18 | 0.1%
-37.7969 | 18 | 0.1%
-37.8536 | 17 | < 0.1%
-37.7941 | 17 | < 0.1%
-37.7634 | 17 | < 0.1%
-37.8127 | 16 | < 0.1%
Other values (13392) | 26691 | 76.6%
(Missing) | 7976 | 22.9%

Minimum 10 values

Value | Count | Frequency (%)
-38.19043 | 1 | < 0.1%
-38.1856 | 1 | < 0.1%
-38.18463 | 1 | < 0.1%
-38.18418 | 1 | < 0.1%
-38.18415 | 1 | < 0.1%
-38.18255 | 1 | < 0.1%
-38.18163 | 1 | < 0.1%
-38.17928 | 1 | < 0.1%
-38.17829 | 1 | < 0.1%
-38.17745 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
-37.3902 | 1 | < 0.1%
-37.3951 | 1 | < 0.1%
-37.3978 | 1 | < 0.1%
-37.39946 | 1 | < 0.1%
-37.40349 | 1 | < 0.1%
-37.4072 | 1 | < 0.1%
-37.40744 | 1 | < 0.1%
-37.40758 | 1 | < 0.1%
-37.40853 | 1 | < 0.1%
-37.40869 | 1 | < 0.1%

Longtitude
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 14524
Distinct (%): 54.0%
Missing: 7976
Missing (%): 22.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 145.0018511
Minimum: 144.42379
Maximum: 145.52635
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 144.42379
5-th percentile: 144.80008
Q1: 144.9335
Median: 145.0078
Q3: 145.0719
95-th percentile: 145.1877
Maximum: 145.52635
Range: 1.10256
Interquartile range (IQR): 0.1384

Descriptive statistics

Standard deviation: 0.1201687692
Coefficient of variation (CV): 0.0008287395521
Kurtosis: 1.545947474
Mean: 145.0018511
Median Absolute Deviation (MAD): 0.06832
Skewness: -0.3948800169
Sum: 3897794.76
Variance: 0.01444053308
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
144.9966 | 21 | 0.1%
144.991 | 17 | < 0.1%
144.985 | 17 | < 0.1%
145.0104 | 17 | < 0.1%
144.9679 | 16 | < 0.1%
144.9911 | 16 | < 0.1%
145.0001 | 16 | < 0.1%
145.0243 | 16 | < 0.1%
144.997 | 15 | < 0.1%
144.9999 | 15 | < 0.1%
Other values (14514) | 26715 | 76.6%
(Missing) | 7976 | 22.9%

Minimum 10 values

Value | Count | Frequency (%)
144.42379 | 1 | < 0.1%
144.43162 | 1 | < 0.1%
144.43181 | 1 | < 0.1%
144.4394 | 1 | < 0.1%
144.44051 | 1 | < 0.1%
144.48571 | 1 | < 0.1%
144.49 | 1 | < 0.1%
144.4926 | 1 | < 0.1%
144.513 | 1 | < 0.1%
144.5206 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
145.52635 | 1 | < 0.1%
145.5237 | 1 | < 0.1%
145.51137 | 1 | < 0.1%
145.48985 | 1 | < 0.1%
145.48273 | 1 | < 0.1%
145.48246 | 1 | < 0.1%
145.4779 | 1 | < 0.1%
145.47282 | 1 | < 0.1%
145.47262 | 1 | < 0.1%
145.47052 | 1 | < 0.1%

Regionname
Categorical

HIGH CORRELATION

Distinct: 8
Distinct (%): < 0.1%
Missing: 3
Missing (%): < 0.1%
Memory size: 272.4 KiB

Length

Max length: 26
Median length: 21
Mean length: 20.85631491
Min length: 16

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Northern Metropolitan
2nd row: Northern Metropolitan
3rd row: Northern Metropolitan
4th row: Northern Metropolitan
5th row: Northern Metropolitan

Common Values

Value | Count | Frequency (%)
Southern Metropolitan | 11836 | 34.0%
Northern Metropolitan | 9557 | 27.4%
Western Metropolitan | 6799 | 19.5%
Eastern Metropolitan | 4377 | 12.6%
South-Eastern Metropolitan | 1739 | 5.0%
Eastern Victoria | 228 | 0.7%
Northern Victoria | 203 | 0.6%
Western Victoria | 115 | 0.3%
(Missing) | 3 | < 0.1%

Length

Histogram of lengths of the category

Pie chart

Words

Value | Count | Frequency (%)
metropolitan | 34308 | 49.2%
southern | 11836 | 17.0%
northern | 9760 | 14.0%
western | 6914 | 9.9%
eastern | 4605 | 6.6%
south-eastern | 1739 | 2.5%
victoria | 546 | 0.8%

Propertycount
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 342
Distinct (%): 1.0%
Missing: 3
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 7572.888306
Minimum: 83
Maximum: 21650
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 272.4 KiB

Quantile statistics

Minimum: 83
5th percentile: 2185
Q1: 4385
Median: 6763
Q3: 10412
95th percentile: 15510
Maximum: 21650
Range: 21567
Interquartile range (IQR): 6027

Descriptive statistics

Standard deviation: 4428.090313
Coefficient of variation (CV): 0.5847293839
Kurtosis: 0.8906876388
Mean: 7572.888306
Median Absolute Deviation (MAD): 2823
Skewness: 0.9921002749
Sum: 263945449
Variance: 19607983.82
Monotonicity: Not monotonic
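Several of the Propertycount figures above are derived from one another, so they can be cross-checked with simple arithmetic. The sketch below is only an illustrative check using the numbers printed in this report (including the 34857 observations from the overview); the variable names are our own and it is not how the profiling tool computes its statistics.

```python
# Figures copied from the Propertycount summary in this report.
mean = 7572.888306
std_dev = 4428.090313
total = 263945449
q1, q3 = 4385, 10412
minimum, maximum = 83, 21650

# Row count from the report overview and the "Missing" count for this variable.
n_rows, n_missing = 34857, 3

value_range = maximum - minimum        # Range = Maximum - Minimum
iqr = q3 - q1                          # Interquartile range = Q3 - Q1
cv = std_dev / mean                    # Coefficient of variation = std / mean
variance = std_dev ** 2                # Variance = squared standard deviation
n_non_missing = round(total / mean)    # Sum / Mean recovers the non-missing count

print(value_range, iqr, cv, variance, n_non_missing)
```

Under these assumptions the recomputed values agree with the reported Range, IQR, CV and Variance up to rounding, and Sum / Mean matches the number of observations minus the missing ones.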
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
21650 | 844 | 2.4%
8870 | 722 | 2.1%
10969 | 583 | 1.7%
14949 | 552 | 1.6%
10412 | 491 | 1.4%
14577 | 485 | 1.4%
10331 | 467 | 1.3%
10579 | 456 | 1.3%
11918 | 444 | 1.3%
14887 | 435 | 1.2%
Other values (332) | 29375 | 84.3%

Value | Count | Frequency (%)
83 | 1 | < 0.1%
121 | 1 | < 0.1%
129 | 1 | < 0.1%
242 | 1 | < 0.1%
249 | 5 | < 0.1%
271 | 1 | < 0.1%
290 | 2 | < 0.1%
335 | 1 | < 0.1%
342 | 1 | < 0.1%
389 | 13 | < 0.1%

Value | Count | Frequency (%)
21650 | 844 | 2.4%
17496 | 204 | 0.6%
17384 | 20 | 0.1%
17093 | 47 | 0.1%
17055 | 123 | 0.4%
16166 | 178 | 0.5%
15542 | 129 | 0.4%
15510 | 255 | 0.7%
15321 | 235 | 0.7%
14949 | 552 | 1.6%

Interactions

[Figure: grid of pairwise interaction plots (144 panels; images not preserved)]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and 1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
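The calculation described above can be sketched in plain Python: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. The helper names and the small `rooms`/`price` lists are hypothetical illustration data, and ties are given average ranks, which is a common convention but an assumption here.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: a monotonic but nonlinear (cubic) relationship.
rooms = [1, 2, 3, 4, 5]
price = [1, 8, 27, 64, 125]
rho = spearman(rooms, price)
r = pearson(rooms, price)
print(rho, r)
```

Because the relationship is strictly increasing, ρ comes out at 1 (up to floating-point error) while Pearson's r stays below 1, which is the difference the description points at.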

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and 1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
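As a rough illustration of the formula and of the invariance property mentioned above, the sketch below computes r for a small hypothetical sample and again after shifting and rescaling both variables by positive factors. The function name and the data are assumptions made for the example only.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical sample: distance from the city centre vs. some decreasing quantity.
distance = [2.5, 6.3, 11.2, 15.4, 25.5]
quantity = [1200, 950, 900, 700, 560]
r = pearson_r(distance, quantity)

# Change location and scale separately for each variable (positive scale factors).
distance_m = [1000 * d + 5 for d in distance]
quantity_k = [q / 1000 - 0.1 for q in quantity]
r_rescaled = pearson_r(distance_m, quantity_k)

print(r, r_rescaled)
```

The two results agree up to floating-point error; note that a negative scale factor on one variable would flip the sign of r.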

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and 1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
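The pair-counting definition above translates directly into code. This minimal sketch implements exactly that definition (often called τ-a), in which tied pairs count as neither concordant nor discordant; statistical libraries frequently use a tie-adjusted variant instead, so their results can differ when ties are present. The function name and data are hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Same sign of differences means the pair is ordered the same way in x and y.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

# Hypothetical rankings of five observations.
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 5]
tau = kendall_tau_a(x, y)
print(tau)
```

For this sample there are 7 concordant and 3 discordant pairs out of 10, giving τ = 0.4.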

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
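A minimal sketch of a bias-corrected Cramér's V follows, assuming the commonly cited form of Bergsma's correction: φ² = χ²/n is reduced by (k-1)(r-1)/(n-1) and the table dimensions r and k are shrunk by (r-1)²/(n-1) and (k-1)²/(n-1). The function name and the toy labels are hypothetical, and this is not claimed to be the profiling tool's exact implementation.

```python
from collections import Counter
from math import sqrt

def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    rows, cols = Counter(a), Counter(b)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for row_label, row_total in rows.items():
        for col_label, col_total in cols.items():
            expected = row_total * col_total / n
            observed = joint.get((row_label, col_label), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = len(rows), len(cols)
    # Bias correction (assumed form, see lead-in).
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0

# Hypothetical labels: each region maps to exactly one council (perfect association).
region = ["N", "S", "W"] * 10
council = ["council_n", "council_s", "council_w"] * 10
v_perfect = cramers_v_corrected(region, council)

# Hypothetical labels with no association at all.
left = ["x", "x", "y", "y"] * 5
right = ["p", "q", "p", "q"] * 5
v_independent = cramers_v_corrected(left, right)

print(v_perfect, v_independent)
```

The perfectly associated pair yields a value of 1 (up to floating-point error) and the independent pair yields 0, matching the range described above.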

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
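To make the idea of nullity correlation concrete, the sketch below turns two columns into presence indicators (1 = present, 0 = missing) and computes Pearson's r between them. Treating nullity correlation as Pearson's r on such indicators is an assumption about how the heatmap is built; the function name is hypothetical. The sample values are the BuildingArea, YearBuilt and Landsize entries of the first ten rows shown in this report, with NaN written as None.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson's r between presence indicators; None if a column never varies."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        # A column that is always present (or always missing) has no nullity signal.
        return None
    return cov / sqrt(var_a * var_b)

# First ten rows of the dataset as listed under "First rows" (NaN -> None).
building_area = [None, None, 79.0, None, 150.0, None, 142.0, 220.0, None, None]
year_built = [None, None, 1900.0, None, 1900.0, None, 2014.0, 2006.0, 1900.0, 1900.0]
landsize = [126.0, 202.0, 156.0, 0.0, 134.0, 94.0, 120.0, 400.0, 201.0, 202.0]

print(nullity_correlation(building_area, year_built))
print(nullity_correlation(landsize, year_built))
```

A positive value means the two columns tend to be present or missing together; a fully populated column such as Landsize in this small sample has no defined nullity correlation, so the sketch returns None for it.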

Sample

First rows

Index | Suburb | Address | Rooms | Type | Method | SellerG | Date | Distance | Postcode | Bedroom2 | Bathroom | Car | Landsize | BuildingArea | YearBuilt | CouncilArea | Lattitude | Longtitude | Regionname | Propertycount
0 | Abbotsford | 68 Studley St | 2 | h | SS | Jellis | 3/09/2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 1.0 | 126.0 | NaN | NaN | Yarra City Council | -37.8014 | 144.9958 | Northern Metropolitan | 4019.0
1 | Abbotsford | 85 Turner St | 2 | h | S | Biggin | 3/12/2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 1.0 | 202.0 | NaN | NaN | Yarra City Council | -37.7996 | 144.9984 | Northern Metropolitan | 4019.0
2 | Abbotsford | 25 Bloomburg St | 2 | h | S | Biggin | 4/02/2016 | 2.5 | 3067.0 | 2.0 | 1.0 | 0.0 | 156.0 | 79.0 | 1900.0 | Yarra City Council | -37.8079 | 144.9934 | Northern Metropolitan | 4019.0
3 | Abbotsford | 18/659 Victoria St | 3 | u | VB | Rounds | 4/02/2016 | 2.5 | 3067.0 | 3.0 | 2.0 | 1.0 | 0.0 | NaN | NaN | Yarra City Council | -37.8114 | 145.0116 | Northern Metropolitan | 4019.0
4 | Abbotsford | 5 Charles St | 3 | h | SP | Biggin | 4/03/2017 | 2.5 | 3067.0 | 3.0 | 2.0 | 0.0 | 134.0 | 150.0 | 1900.0 | Yarra City Council | -37.8093 | 144.9944 | Northern Metropolitan | 4019.0
5 | Abbotsford | 40 Federation La | 3 | h | PI | Biggin | 4/03/2017 | 2.5 | 3067.0 | 3.0 | 2.0 | 1.0 | 94.0 | NaN | NaN | Yarra City Council | -37.7969 | 144.9969 | Northern Metropolitan | 4019.0
6 | Abbotsford | 55a Park St | 4 | h | VB | Nelson | 4/06/2016 | 2.5 | 3067.0 | 3.0 | 1.0 | 2.0 | 120.0 | 142.0 | 2014.0 | Yarra City Council | -37.8072 | 144.9941 | Northern Metropolitan | 4019.0
7 | Abbotsford | 16 Maugie St | 4 | h | SN | Nelson | 6/08/2016 | 2.5 | 3067.0 | 3.0 | 2.0 | 2.0 | 400.0 | 220.0 | 2006.0 | Yarra City Council | -37.7965 | 144.9965 | Northern Metropolitan | 4019.0
8 | Abbotsford | 53 Turner St | 2 | h | S | Biggin | 6/08/2016 | 2.5 | 3067.0 | 4.0 | 1.0 | 2.0 | 201.0 | NaN | 1900.0 | Yarra City Council | -37.7995 | 144.9974 | Northern Metropolitan | 4019.0
9 | Abbotsford | 99 Turner St | 2 | h | S | Collins | 6/08/2016 | 2.5 | 3067.0 | 3.0 | 2.0 | 1.0 | 202.0 | NaN | 1900.0 | Yarra City Council | -37.7996 | 144.9989 | Northern Metropolitan | 4019.0

Last rows

Index | Suburb | Address | Rooms | Type | Method | SellerG | Date | Distance | Postcode | Bedroom2 | Bathroom | Car | Landsize | BuildingArea | YearBuilt | CouncilArea | Lattitude | Longtitude | Regionname | Propertycount
34847 | Wollert | 27 Birchmore Rd | 3 | h | PI | Ray | 24/02/2018 | 25.5 | 3750.0 | 3.0 | 2.0 | 2.0 | 383.0 | 118.0 | 2016.0 | Whittlesea City Council | -37.61940 | 145.03951 | Northern Metropolitan | 2940.0
34848 | Wollert | 16 Gunther Wy | 4 | h | S | hockingstuart | 24/02/2018 | 25.5 | 3750.0 | 4.0 | 2.0 | 2.0 | 375.0 | NaN | NaN | Whittlesea City Council | -37.61331 | 145.03412 | Northern Metropolitan | 2940.0
34849 | Wollert | 35 Kingscote Wy | 3 | h | SP | RW | 24/02/2018 | 25.5 | 3750.0 | 3.0 | 2.0 | 2.0 | 404.0 | 158.0 | 2012.0 | Whittlesea City Council | -37.61031 | 145.03393 | Northern Metropolitan | 2940.0
34850 | Wollert | 15 Rockgarden Wy | 3 | h | SP | LJ | 24/02/2018 | 25.5 | 3750.0 | 3.0 | 2.0 | 2.0 | 268.0 | 135.0 | 2016.0 | Whittlesea City Council | -37.61094 | 145.04281 | Northern Metropolitan | 2940.0
34851 | Yarraville | 78 Bayview Rd | 3 | h | S | Jas | 24/02/2018 | 6.3 | 3013.0 | 3.0 | 1.0 | NaN | 288.0 | NaN | NaN | Maribyrnong City Council | -37.81095 | 144.88516 | Western Metropolitan | 6543.0
34852 | Yarraville | 13 Burns St | 4 | h | PI | Jas | 24/02/2018 | 6.3 | 3013.0 | 4.0 | 1.0 | 3.0 | 593.0 | NaN | NaN | Maribyrnong City Council | -37.81053 | 144.88467 | Western Metropolitan | 6543.0
34853 | Yarraville | 29A Murray St | 2 | h | SP | Sweeney | 24/02/2018 | 6.3 | 3013.0 | 2.0 | 2.0 | 1.0 | 98.0 | 104.0 | 2018.0 | Maribyrnong City Council | -37.81551 | 144.88826 | Western Metropolitan | 6543.0
34854 | Yarraville | 147A Severn St | 2 | t | S | Jas | 24/02/2018 | 6.3 | 3013.0 | 2.0 | 1.0 | 2.0 | 220.0 | 120.0 | 2000.0 | Maribyrnong City Council | -37.82286 | 144.87856 | Western Metropolitan | 6543.0
34855 | Yarraville | 12/37 Stephen St | 3 | h | SP | hockingstuart | 24/02/2018 | 6.3 | 3013.0 | NaN | NaN | NaN | NaN | NaN | NaN | Maribyrnong City Council | NaN | NaN | Western Metropolitan | 6543.0
34856 | Yarraville | 3 Tarrengower St | 2 | h | PI | RW | 24/02/2018 | 6.3 | 3013.0 | 2.0 | 1.0 | 0.0 | 250.0 | 103.0 | 1930.0 | Maribyrnong City Council | -37.81810 | 144.89351 | Western Metropolitan | 6543.0

Duplicate rows

Most frequently occurring

Index | Suburb | Address | Rooms | Type | Method | SellerG | Date | Distance | Postcode | Bedroom2 | Bathroom | Car | Landsize | BuildingArea | YearBuilt | CouncilArea | Lattitude | Longtitude | Regionname | Propertycount | # duplicates
0 | Nunawading | 1/7 Lilian St | 3 | t | SP | Jellis | 17/06/2017 | 15.4 | 3131.0 | 3.0 | 3.0 | 2.0 | 405.0 | 226.0 | 2000.0 | Manningham City Council | -37.82678 | 145.16777 | Eastern Metropolitan | 4973.0 | 2